NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Progressive and Scalable Hotspot Detection through Local K-Function in Spatial Networks

https://doi.org/10.1145/3766549

Kang, Yunfan; Liu, Yongyi; Juvekar, Pratham; Mahmood, Ahmed; Wang, Shaowen; Magdy, Amr (September 2025, ACM Transactions on Spatial Algorithms and Systems)

The widespread availability of geotagged data combined with modern map services allows for the accurate attachment of data to spatial networks. Applying statistical analysis, such as hotspot detection, over spatial networks is very important for precise quantification and patterns analysis, which empowers effective decision-making in various important applications. Existing hotspot detection algorithms on spatial networks either lack sufficient statistical evidence on detected hotspots, such as clustering, or they provide statistical evidence at a prohibitive computational overhead. In this paper, we propose efficient algorithms for detecting hotspots based on the network local K-function for predefined and unknown hotspot radii. The K-function is a widely adopted statistical approach for network pattern analysis that enables the understanding of the density and distribution of activities and events happening within the spatial network. However, its practical application has been limited due to the inefficiency of state-of-the-art algorithms, particularly for large-sized networks. Extensive experimental evaluation using real and synthetic datasets shows that our algorithms are up to 28 times faster than the state-of-the-art algorithms in computing hotspots with a predefined radius and up to more than four orders of magnitude faster in identifying hotspots without a predefined radius. Additionally, to address dynamic changes in the spatial network, we propose an incremental hotspot detection approach that efficiently updates hotspot computations by leveraging prior results as new events are added.
more » « less
Free, publicly-accessible full text available September 11, 2026
Pyneapple-L: Scalable Expressive Learning-based Spatial Analysis

https://doi.org/10.1145/3678717.3691228

Liu, Yongyi; Lee, Nicolas; Kang, Yunfan; Shahneh, Mohammad Reza; Mahmood, Ahmed; Chinnam, Vishal Rohith; Sarawadekar, Aparna Vivek; Oymak, Samet; Sabek, Ibrahim; Magdy, Amr (October 2024, ACM)

Full Text Available
Pyneapple-R: Scalable and Expressive Spatial Regionalization

https://doi.org/10.1109/ICDE60146.2024.00437

Kang, Yunfan; Liu, Yongyi; Alrashid, Hussah; Bilgi, Akash; Purohit, Siddhant; Mahmood, Ahmed; Rey, Sergio; Magdy, Amr (May 2024, IEEE)

Full Text Available
Scalable Evaluation of Local K-Function for Radius-Accurate Hotspot Detection in Spatial Networks

https://doi.org/10.1145/3589132.3625646

Liu, Yongyi; Kang, Yunfan; Mahmood, Ahmed; Magdy, Amr (November 2023, The 31st ACM International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL '23))

The widespread of geotagged data combined with modern map services allows for the accurate attachment of data to spatial networks. Applying statistical analysis, such as hotspot detection, over spatial networks is very important for precise quantification and patterns analysis, which empowers effective decision-making in various important applications. Existing hotspot detection algorithms on spatial networks either lack statistical evidence on detected hotspots, such as clustering, or they provide statistical evidence at a prohibitive computational overhead. In this paper, we propose efficient algorithms for detecting hotspots based on the network local K-function for predefined and unknown hotspot radii. The network local K-function is a widely adopted statistical approach for network pattern analysis that enables the understanding of the density and distribution of activities and events in the spatial network. However, its practical application has been limited due to the inefficiency of existing algorithms, particularly for large-sized networks. Extensive experimental evaluation using real and synthetic datasets shows that our algorithms are up to 28 times faster than the state-of-the-art algorithms in computing hotspots with a predefined radius and up to more than four orders of magnitude faster in identifying hotspots without a predefined radius.
more » « less
Full Text Available
PRUC: P-regions with user-defined constraint

https://doi.org/10.14778/3494124.3494133

Liu, Yongyi; Mahmood, Ahmed R.; Magdy, Amr; Rey, Sergio (November 2021, Proceedings of the VLDB Endowment)

This paper introduces a generalized spatial regionalization problem, namely, PRUC ( P -Regions with User-defined Constraint) that partitions spatial areas into homogeneous regions. PRUC accounts for user-defined constraints imposed over aggregate region properties. We show that PRUC is an NP-Hard problem. To solve PRUC, we introduce GSLO (Global Search with Local Optimization), a parallel stochastic regionalization algorithm. GSLO is composed of two phases: (1) Global Search that initially partitions areas into regions that satisfy a user-defined constraint, and (2) Local Optimization that further improves the quality of the partitioning with respect to intra-region similarity. We conduct an extensive experimental study using real datasets to evaluate the performance of GSLO. Experimental results show that GSLO is up to 100× faster than the state-of-the-art algorithms. GSLO provides partitioning that is up to 6× better with respect to intra-region similarity. Furthermore, GSLO is able to handle 4× larger datasets than the state-of-the-art algorithms.
more » « less
Full Text Available
SWARM: Adaptive Load Balancing in Distributed Streaming Systems for Big Spatial Data

https://doi.org/10.1145/3460013

Daghistani, Anas; Aref, Walid G.; Ghafoor, Arif; Mahmood, Ahmed R. (June 2021, ACM Transactions on Spatial Algorithms and Systems)
null (Ed.)
The proliferation of GPS-enabled devices has led to the development of numerous location-based services. These services need to process massive amounts of streamed spatial data in real-time. The current scale of spatial data cannot be handled using centralized systems. This has led to the development of distributed spatial streaming systems. Existing systems are using static spatial partitioning to distribute the workload. In contrast, the real-time streamed spatial data follows non-uniform spatial distributions that are continuously changing over time. Distributed spatial streaming systems need to react to the changes in the distribution of spatial data and queries. This article introduces SWARM, a lightweight adaptivity protocol that continuously monitors the data and query workloads across the distributed processes of the spatial data streaming system and redistributes and rebalances the workloads as soon as performance bottlenecks get detected. SWARM is able to handle multiple query-execution and data-persistence models. A distributed streaming system can directly use SWARM to adaptively rebalance the system’s workload among its machines with minimal changes to the original code of the underlying spatial application. Extensive experimental evaluation using real and synthetic datasets illustrate that, on average, SWARM achieves 2 improvement in throughput over a static grid partitioning that is determined based on observing a limited history of the data and query workloads. Moreover, SWARM reduces execution latency on average 4 compared with the other technique.
more » « less
Full Text Available
An experimental evaluation and investigation of waves of misery in r-trees

https://doi.org/10.14778/3494124.3494132

Xing, Lu; Lee, Eric; An, Tong; Chu, Bo-Cheng; Mahmood, Ahmed; Aly, Ahmed M.; Wang, Jianguo; Aref, Walid G. (November 2021, Proceedings of the VLDB Endowment)

Waves of miseryis a phenomenon where spikes of many node splits occur over short periods of time in tree indexes. Waves of misery negatively affect the performance of tree indexes in insertion-heavy workloads. Waves of misery have been first observed in the context of the B-tree, where these waves cause unpredictable index performance. In particular, the performance of search and index-update operations deteriorate when a wave of misery takes place, but is more predictable between the waves. This paper investigates the presence or lack of waves of misery in several R-tree variants, and studies the extent of which these waves impact the performance of each variant. Interestingly, although having poorer query performance, the Linear and Quadratic R-trees are found to be more resilient to waves of misery than both the Hilbert and R*-trees. This paper presents several techniques to reduce the impact in performance of the waves of misery for the Hilbert and R*-trees. One way to eliminate waves of misery is to force node splits to take place at regular times before nodes become full to achieve deterministic performance. The other way is that upon splitting a node, do not split it evenly but rather at different node utilization factors. This allows leaf nodes not to fill at the same pace. We study the impact of two new techniques to mitigate waves of misery after the tree index has been constructed, namely Regular Elective Splits (RES, for short) and Unequal Random Splits (URS, for short). Our experimental investigation highlights the trade-offs in performance of the introduced techniques and the pros and cons of each technique.
more » « less
Full Text Available
SSTD: a distributed system on streaming spatio-textual data

https://doi.org/10.14778/3407790.3407825

Chen, Yue; Chen, Zhida; Cong, Gao; Mahmood, Ahmed R.; Aref, Walid G. (August 2020, Proceedings of the VLDB Endowment)
null (Ed.)
Full Text Available
LocationSpark: In-memory Distributed Spatial Query Processing and Optimization

https://doi.org/10.3389/fdata.2020.00030

Tang, Mingjie; Yu, Yongyang; Mahmood, Ahmed R.; Malluhi, Qutaibah M.; Ouzzani, Mourad; Aref, Walid G. (October 2020, Frontiers in Big Data)
null (Ed.)
Full Text Available
Prompt: Dynamic Data-Partitioning for Distributed Micro-batch Stream Processing Systems

https://doi.org/10.1145/3318464.3389713

Abdelhamid, Ahmed S.; Mahmood, Ahmed R.; Daghistani, Anas; Aref, Walid G. (May 2020, SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data)
null (Ed.)
Full Text Available

« Prev Next »

Search for: All records